Dynamic Multi-Armed Bandits and Extreme Value-Based Rewards for Adaptive Operator Selection in Evolutionary Algorithms

نویسندگان

  • Álvaro Fialho
  • Luís Da Costa
  • Marc Schoenauer
  • Michèle Sebag
چکیده

The performance of many efficient algorithms critically depends on the tuning of their parameters, which on turn depends on the problem at hand, e.g., the performance of Evolutionary Algorithms critically depends on the judicious setting of the operator rates. The Adaptive Operator Selection (AOS) heuristic that is proposed here rewards each operator based on the extreme value of the fitness improvement lately incurred by this operator, and uses a Multi-Armed Bandit (MAB) selection process based on those rewards to choose which operator to apply next. This Extreme-based Multi-Armed Bandit approach is experimentally validated against the Average-based MAB method, and is shown to outperform previously published methods, whether using a classical Average-based rewarding technique or the same Extreme-based mechanism. Validation test suite includes the easy One-Max problem and a family of hard problems known as “Long k-paths”.

برای دانلود رایگان متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Adaptive Operator Selection in EAs with Extreme - Dynamic Multi-Armed Bandits

The performance of evolutionary algorithms is highly affected by the selection of the variation operators to solve the problem at hand. This paper presents a brief review of the results that have been recently obtained using the “Extreme Dynamic Multi-Armed Bandit” (Ex-DMAB), a technique used to automatically select the operator to be applied between the available ones, while searching for the ...

متن کامل

Extreme Compass and Dynamic Multi-Armed Bandit for Adaptive Operator Selection

The goal of Adaptive Operator Selection is the on-line control of the choice of variation operators within Evolutionary Algorithms. The control process is based on two main components, the credit assignment, that defines the reward that will be used to evaluate the quality of an operator after it has been applied, and the operator selection mechanism, that selects one operator based on all oper...

متن کامل

On Adaptive Estimation for Dynamic Bernoulli Bandits

The multi-armed bandit (MAB) problem is a classic example of the exploration-exploitation dilemma. It is concerned with maximising the total rewards for a gambler by sequentially pulling an arm from a multi-armed slot machine where each arm is associated with a reward distribution. In static MABs, the reward distributions do not change over time, while in dynamic MABs, each arm’s reward distrib...

متن کامل

Adaptive Operator Selection for Optimization

Evolutionary Algorithms (EAs) are stochastic optimization algorithms which have already shown their efficiency on many application domains. This is achieved mainly due to the many parameters that can be defined by the user according to the problem at hand. However, the performance of EAs is very sensitive to the setting of these parameters, and there are no general guidelines for an efficient s...

متن کامل

Extreme Value Based Adaptive Operator Selection

Credit Assignment is an important ingredient of several proposals that have been made for Adaptive Operator Selection. Instead of the average fitness improvement of newborn offspring, this paper proposes to use some empirical order statistics of those improvements, arguing that rare but highly beneficial jumps matter as much or more than frequent but small improvements. An extreme value based C...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2009